32 research outputs found

Processor Allocation for Optimistic Parallelization of Irregular Programs

Author: A. Braunstein
D. Freedman
F. Versaci
H.D. Friedman
J. Jensen
J. Reinders
K. Agrawal
K. Georgiou
K. Pingali
L.J. Guibas
L.S. Blackford
M. Frigo
M. Püschel
P. An
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2012

Optimistic parallelization is a promising approach for the parallelization of irregular algorithms: potentially interfering tasks are launched dynamically, and the runtime system detects conflicts between concurrent activities, aborting and rolling back conflicting tasks. However, parallelism in irregular algorithms is very complex. In a regular algorithm like dense matrix multiplication, the amount of parallelism can usually be expressed as a function of the problem size, so it is reasonably straightforward to determine how many processors should be allocated to execute a regular algorithm of a certain size (this is called the processor allocation problem). In contrast, parallelism in irregular algorithms can be a function of input parameters, and the amount of parallelism can vary dramatically during the execution of the irregular algorithm. Therefore, the processor allocation problem for irregular algorithms is very difficult. In this paper, we describe the first systematic strategy for addressing this problem. Our approach is based on a construct called the conflict graph, which (i) provides insight into the amount of parallelism that can be extracted from an irregular algorithm, and (ii) can be used to address the processor allocation problem for irregular algorithms. We show that this problem is related to a generalization of the unfriendly seating problem and, by extending Turán's theorem, we obtain a worst-case class of problems for optimistic parallelization, which we use to derive a lower bound on the exploitable parallelism. Finally, using some theoretically derived properties and some experimental facts, we design a quick and stable control strategy for solving the processor allocation problem heuristically. Comment: 12 pages, 3 figures, extended version of SPAA 2011 brief announcement.

arXiv.org e-Print Archive

Crossref

Computational Nuclear Physics and Post Hartree-Fock Methods

Author: A. Baran
A. Carbone
A. Ekström
A.W. Steiner
B.D. Carlsson
B.D. Day
B.H. Brandow
C. Gros
C. Lanczos
C. Lin
C.J. Horowitz
D. Thompson
E. Epelbaum
E.R. Davidson
F. Coester
F. Sammarruca
G. Baardsen
G. Golub
G. Hagen
G.R. Jansen
H. Heiselberg
H. Hergert
H. Kümmel
I. Shavitt
J. Carlson
J.J. Shepherd
J.M. Lattimer
J.P. Blaizot
K.A. Brueckner
L. Coraggio
L.S. Blackford
M. Baldo
M. Prakash
P. Navrátil
R. Machleidt
R.C. Martin
R.D. Mattuck
R.J. Bartlett
S. Binder
S. Weinberg
S.L. Shapiro
T. Inoue
T.D. Morris
U. van Kolck
V. Somà
W. Dickhoff
W. Gropp
Publication venue: Springer Science and Business Media LLC
Publication date: 21/11/2016

We present a computational approach to infinite nuclear matter employing Hartree-Fock theory, many-body perturbation theory and coupled cluster theory. These lectures are closely linked with those of chapters 9, 10 and 11 and serve as input for the correlation functions employed in Monte Carlo calculations in chapter 9, the in-medium similarity renormalization group theory of dense fermionic systems of chapter 10 and the Green's function approach in chapter 11. We provide extensive code examples and benchmark calculations, allowing thereby an eventual reader to start writing her/his own codes. We start with an object-oriented serial code and end with discussions on strategies for porting the code to present and planned high-performance computing facilities. Comment: 82 pages, to appear in Lecture Notes in Physics (Springer), "An advanced course in computational nuclear physics: Bridging the scales from quarks to neutron stars", M. Hjorth-Jensen, M. P. Lombardo, U. van Kolck, Editors.

arXiv.org e-Print Archive

Crossref

Parallel computation of 3-D soil-structure interaction in time domain with a coupled FEM/SBFEM approach

Author: B. Engquist
C. Petersen
D. Appelö
D. Guerrero
D. Kleinman
E. Anderson
Enrique S. Quintana-Ortí
H. Antes
H. Hilbert
J. Lysmer
J. Roberts
J. Wolf
Jose E. Roman
K. Meskouris
L. Lehmann
L.S. Blackford
M. Schauer
M.E. Harr
Marco Schauer
N. Newmark
P. Benner
P. Bettess
R. Granat
R.J. Astley
Sabine Langer
Z.P. Liao
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2012

The final publication is available at Springer via http://dx.doi.org/10.1007/s10915-011-9551-x. This paper introduces a parallel algorithm for the scaled boundary finite element method (SBFEM). The application code is designed to run on clusters of computers, and it enables the analysis of large-scale soil-structure-interaction problems, where an unbounded domain has to fulfill the radiation condition for wave propagation to infinity. The main focus of the paper is on the mathematical description and numerical implementation of the SBFEM. In particular, we describe in detail the algorithm to compute the acceleration unit impulse response matrices used in the SBFEM as well as the solvers for the Riccati and Lyapunov equations. Finally, two test cases validate the new code, illustrating the numerical accuracy of the results and the parallel performances. © Springer Science+Business Media, LLC 2011. Jose E. Roman and Enrique S. Quintana-Orti were partially supported by the Spanish Ministerio de Ciencia e Innovacion under grants TIN2009-07519, and TIN2008-06570-C04-01, respectively. Schauer, M.; Román Moltó, JE.; Quintana Orti, ES.; Langer, S. (2012). Parallel computation of 3-D soil-structure interaction in time domain with a coupled FEM/SBFEM approach. Journal of Scientific Computing. 52(2):446-467. doi:10.1007/s10915-011-9551-x

Crossref

Repositori Institucional de la Universitat Jaume I

RiuNet

Integrating Scientific Software Libraries in Problem Solving Environments: A Case Study with ScaLAPACK

Author: L.S. Blackford
M.R. Guarracino
T. Sterling
W. Gropp
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2004

Crossref

High performance linear algebra package for FORTRAN 90

Author: C.H. Koelbel
E. Anderson
J. Dongarra
L.S. Blackford
M. Metcalf
Publication venue: Springer Science and Business Media LLC

Crossref

Compatibility of Scalapack with the Discrete Wavelet Transform

Author: I. Daubechies
L.S. Blackford
M.V. Wickerhauser
R. Barrett
T.F. Chan
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2007

Crossref

Incomplete cyclic reduction of banded and strictly diagonally dominant linear systems

Author: C.C.K. Mikkelsen
D. Heller
L.S. Blackford
P. Arbenz
R.W. Hockney
S.K. Lele
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2012

The ScaLAPACK library contains a pair of routines for solving banded linear systems which are strictly diagonally dominant by rows. Mathematically, the algorithm is complete block cyclic reduction corresponding to a particular block partitioning of the system. In this paper we extend Heller’s analysis of incomplete cyclic reduction for block tridiagonal systems to the ScaLAPACK case. We obtain a tight estimate on the significance of the off-diagonal blocks of the tridiagonal linear systems generated by the cyclic reduction algorithm. Numerical experiments illustrate the advantage of omitting all but the first reduction step for a class of matrices related to high-order approximations of the Laplace operator.

Crossref

Publikationer från Umeå universitet

Digitala Vetenskapliga Arkivet - Academic Archive On-line

VMAD: An Advanced Dynamic Program Analysis and Instrumentation Framework

Author: C. Jaramillo
J.R. Larus
L.S. Blackford
M. Arnold
M. Kim
V. Aslot
X. Zhang
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2012

Crossref

Sparse matrices in Matlab*P: Design and implementation

Author: D.E. Culler
G.E. Blelloch
J.R. Gilbert
L.S. Blackford
R.B. Lehoucq
X.S. Li
Publication venue: Springer
Publication date: 01/01/2004

Matlab*P is a flexible interactive system that enables computational scientists and engineers to use a high-level language to program cluster computers. The Matlab*P user writes code in the Matlab language. Parallelism is available via data-parallel operations on distributed objects and via task-parallel operations on multiple objects. Matlab*P can store distributed matrices in either full or sparse format. As in Matlab, most matrix operations apply equally to full or sparse operands. Here, we describe the design and implementation of Matlab*P’s sparse matrix support, and an application to a problem in computational fluid dynamics.

CiteSeerX

Crossref

Parallel solution of narrow banded diagonally dominant linear systems

Author: C.C.K. Mikkelsen
D. Heller
E. Polizzi
L.S. Blackford
P. Arbenz
R.W. Hockney
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2012

ScaLAPACK contains a pair of routines for solving systems which are narrow banded and diagonally dominant by rows. Mathematically, the algorithm is block cyclic reduction. The ScaLAPACK implementation can be improved using incomplete, rather than complete, block cyclic reduction. If the matrix is strictly dominant by rows, then the truncation error can be bounded directly in terms of the dominance factor and the size of the partitions. Our analysis includes new results applicable in our ongoing work of developing an efficient parallel solver.

Crossref

Publikationer från Umeå universitet

Digitala Vetenskapliga Arkivet - Academic Archive On-line